Exact solution of a two-type branching process: Models of tumor progression 
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An explicit solution for a general two-type birth-death branching process with one way mutation 
is presented. This continuous time process mimics the evolution of resistance to treatment, or the 
onset of an extra driver mutation during tumor progression. We obtain the exact generating function 
of the process at arbitrary times, and derive various large time scaling limits. In the simultaneous 
small mutation rate and large time scaling limit, the distribution of the mutant cells develops some 
atypical properties, including a power law tail and diverging average. 



I. INTRODUCTION 

Mathematical modeling has a long history in cancer 
research [2 . Stochastic models helped to establish the 
concept of multiple mutations in tumor progression [3], 
and led to the understanding of tumor suppressor genes 
m [S] . These models described the transitions between 
various stages of cancer [SI [7] in an effective way, without 
considering population genetics. A more "microscopic" 
approach is to model the stochastic evolution of a pop- 
ulation of individual cells [S]- The simplest such models 
are branching processes [10], with countless biologi- 
cal applications (TTJ [12] ranging from bacterial evolution 
[131 [13] to cell homeostasis [13]. Branching processes have 
also been used to model several aspects of tumor progres- 
sion [ii[niiiMi]- 

Can, however, simple branching processes provide a 
quantitative description of such complex biological phe- 
nomena as homeostasis - the maintenance of healthy 
tissues? A promising positive answer appeared in |22j . 
where inducible genetic labeling was used to analyze the 
stochastic fate of the progenies of a single initial marked 
cell in the basal layer of epidermis in mice. The prob- 
abilities of finding clones of any given sizes at various 
times were fit remarkably well by a simple constant rate 
two-type branching process of progenitor (A) and post- 
mitotic cells (C). The scheme of the process is as follows 
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This model seemed unsolvable [22] in the sense that 
there was no known explicit formula for the probability 
Pm,n{t) of finding m number of A cells and n number 
of C cells for any finite time. We showed in [23] how 
to obtain an exact expression for the generating function 
of Pm.n{t)- Once the generating function is known, the 
probability Pm,n{t) can be efficiently obtained numeri- 
cally by fast Fourier transformation. 

The simplicity of the above model lies in the trivial 
behavior of C cells - which can only die but not prolif- 
erate - and in the criticality of the process, that is the 
symmetry of birth and death rates of A cells. Is it possi- 
ble to solve more general two type birth-death processes 



explicitly where the second cell type can both die and 
proliferate? In this paper we present an explicit solution 
of such a general continuous time birth-death process, 
proposed by Kendall [16] to model the onset of muta- 
tions during tumor progression. In this model, A cells 
can mutate irreversibly into B cells according to 



A- 

A- 

A 

B 

B 



AA 



B 

BB 



rate ai 
rate fix 
rate v 
rate a2 
rate (i2 



(2) 



We calculate the generating function that encapsulates 
the probability distribution Pm,n{t) for finding m copies 
of A and n copies of B cells at time t. Without loss of 
generality, we set the division rate of A cells to one, ai = 
1, since this can always be achieved by simply rescaling 
time. We shall use the following shorthand notations for 
the rate differences 



Ai = 1 - /3i 



X2 ~ a2 — P2 



(3) 



which can be considered as the fitness values of the cor- 
responding cell types. Note that there is no restriction 
on the signs of parameters Ai and A2. 

There are several possible applications of model ([2|. 
It can model the onset of a new mutation in an evolving 
population of progenitor cells (e.g. the A cells of process 
([1])) of a healthy tissue. The new mutation can be either 
a neutral passenger (A2 = 0), or an advantageous driver 
(A2 > 0) [Ml [23- Model ([2]) can also describe the onset 
of a chemotherapy resistant mutation in cancer jl8i i2T] , 
corresponding to the case Ai = A2 > 0, or provide a min- 
imal model of metastasis formation, where B cells play 
the role of metastasized cells [2SH25] . When an advanta- 
geous (driver) mutation appears in a supercritical process 
(A2 > Ai > 0), model ([2| provides a detailed description 
of the progression of a tumor towards malignancy. This 
scenario occurs in models that follow the onsets of mul- 
tiple mutations [3l [20l [29H3Tj . Such a multistage model 
was recently found to fit clinical data (on the number of 
driver versus passenger mutations in cancer) remarkably 
well [19]. This demonstrates that branching processes 
can provide a quantitative descriptions of certain aspects 
of cancer. 
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Note that in model ^ mutations happen at arbitrary 
times, which can be either a reasonable simplifying as- 
sumption, or a realistic feature when, for instance, muta- 
tions are caused by UV radiation More often, how- 
ever, mutations happen during cell divisions j7j, which 
case is described by the scheme 



A AB rate u 



(4) 



instead of the third process of ([2]). The solution of this 
alternative process goes completely analogously to that 
of ([2|, and is discussed in Appendix [A} 

The two-type process ([2| and other multi-type branch- 
ing processes have been extensively studied [5HTT]. but 
the main focus was on the large time limit behavior. Our 
results, however, are exact for any finite time, which has 
relevance when comparing them to experiments. Multi- 
type branching process were generally considered in- 
tractable as even in the simplest cases one arrives to (gen- 
erally unsolvable) Riccati equations. Some notable ex- 
ceptions are the two-type Luria-Delbruck models, which 
describe bacteria growth in a dish, where cell death can 
be neglected. Indeed, in the case of no death /3i = /32 = 0, 
model ^ with ^ can be turned into a much easier 
Bernoulli equation, which leads to a simple solution [TT] . 
Recently, however, we solved [23j a two-type branching 
process with cell death Q, and this renewed the hope 
that Kendall's model of tumor growth is also solvable. 
We shall show that the processes ^ and Q are also 
analytically tractable. 

The rest of the paper is organized as follows. In Sec.|TT] 
we obtain the explicit solution of the process (2]) in terms 
of generating functions. Special cases of the solution 
when either A or B cells behave critically are discussed 
in Sec. |III| Knowing the explicit solution allows us to 
calculate Pm,n(i) for arbitrary times, but we also derive 
various asymptotic limits of the solution in Sec. |IV| Fi- 
nally, we draw conclusions in Sec. |Vj and relegate some 
details to the Appendixes. 



II. GENERAL CASE 

Let us first try to develop an intuition for the system 
([2]) based on exact results regarding average densities. 
The average number of A cells obeys the rate equation 



d{m) 
dt 



Ai(m) 



(5) 



Hence starting with one A cell we get 

(m) = e^i* (6) 

Similarly the average number of B cells evolves according 
to the rate equation 



d{n) 



v{m) -I- A2(n) 



(7) 



Using ^ and (n) (0) = we solve ([t]) to yield 



(n) — V 



A2 — Ai 
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Higher order densities satisfy more complicated equa- 
tions, but their solution is just a combination of expo- 
nentials. 

The full description of the dynamics of the two-type 
branching process ^ is provided by the probabilities 
Pm.n{t) of having m copies of A and n copies of B at time 
t. These probabilities satisfy the forward and backward 
Kolmogorov equations [TU]. In terms of the generating 
function 



i,n(0 



(9) 



m,n>0 



the forward equations reduce to a single linear partial dif- 
ferential equation. In this article, we shall use backward 
Kolmogorov equations which in the present case are a 
pair of coupled non-linear ordinary differential equations 

mm 

dtA = + Pi + iyT, - (1 + + u)A (10a) 

dt'B ^ a-i'B^ + 1^2 ~ {^2 + 1^2)'^ (10b) 

Here A and 23 are the realizations of the the generating 
function ^ which only differ by their initial condition: 
the notation refers to the type of the single initial cell A 
or B, that is the initial conditions are respectively 



A{x, y,t = 0) = X 
'Bix,y,t^O) = y 



(11a) 
(lib) 



Both the positive (gain) terms and the negative (loss) 
terms in (10a) and (10b) correspond to the individual 
processes in ([2| in a straightforward way, reflecting on 
what happens to the initial cell. For example, since the 
initial A cell divides into two A cells at rate one, we find 
an A'^ term with coefficient one on the right hand side 
of (10a). Note that the forward Kolmogorov equation is 



a first-order hyperbolic partial differential equation and 
hence one can analyze it using the method of characteris- 
tics. The equations for the characteristics are mathemat- 
ically identical to the backward Kolmogorov equations 
(10a) and (10b), so the analytical framework (forward or 
backward Kolmogorov equations) is the matter of choice; 
the latter description is a little more convenient as it does 
not involve intermediate steps related to the characteris- 
tics. 



Equation (10b) contains only 23. It is tractable and its 



solution, subject to (lib I, reads 



■3 = 1- 



Ao 



"2(1 - z) 



1 - 



A, 



"2(1 - y) 



(12) 



Plugging (12) into (10a) we find that X = 1 — A sat- 



isfies a first order non-linear differential equation 

f = -X^ + A,X + ^;^ 
dt 02(1 — z) 



(13) 
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the Riccati equation. A useful way to treat Riccati equa- 
tions is based on the standard trick |32j that turns a 
Riccati equation into a Hnear Sturm-Liouville equation. 
In the present case, this trick suggests to write 



(14) 



One finds that Z satisfies the following Sturm-Liouville 
equation 



iff-Z _ . dZ 



02(1 - z) 

Let us consider A2 > for an instant. In the t 



(15) 



limit, we have z — > and hence Eq. (15) turns into a 



linear equation with constant coefficients which admits 
an exponential asymptotic solution 



Z (X e 



a2 







(16) 



Using ( 12 ) we can re- write the above asymptotic solution 
as Z cx z'^^^^ when z 0. This suggests to seek the 
solution of Eq. (151 in the form 



Z{t) = z"/^^ $(z) 



(17) 



for arbitrary A2. Plugging (17) into (15) we obtain 
z(l - z)$" + [1 



^'' + ^'\l-zW^^^ (18) 
a2A2 



A2 



where the prime denotes the derivative with respect to z. 



Equation ( 18 ) admits a solution in terms of the hyperge- 



ometric function. Indeed, the canonical hypergeometric 
equation reads [32] 



z(l - z)^" + [c - {a + b+ l)z]$' - afe$ = 
and it has two linearly independent solutions 
F{a, b; c; z), z'^~'^-F{a - c+l,6-c+l;2-c;; 



(19) 



(20) 



Equation (18) coincides with (19) if parameters a,b,c 
satisfy 



a + b = 



2uj + Ai 
Ao 



lb = 



+ 6+1 (21) 



Using ( 16 ) and ([20| we arrive at 

$(z) = F{a, 6; c; z) + Cz^-^Fi-b, -a; 2 - c; z) 



with parameters 



a; 



LU + Xl 

A, 



2lu + Ai 
A, 



(22) 



(23) 



To complete the solution (17), (22)-(23) we need to 



know u! and C. Recall that uj is found from (16); the 
proper root is 



(24) 




Recalling previous definitions we have 

1 dZ A2Z d$ 

Z dt $ dz 

We now compute the derivative of the hypergeometric 



function using identity ( C4 ) given in Appendix [Cj The 

(25) 



original generating function A becomes 

A = l+uj + A2*(z) 

where we use the shorthand notation 

z^F3(z) + C(l - c)F2(z) + CzFiiz) 



v|/(z) 



with 
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(26) 



(27) 
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1 — 6, 3 — c; z) 



From (11a) we determine the value of the parameter 
KFiizo) - F^izo) 



(1 - c - KZo)F2(zo) + zoF^izo) 



with 



X — I — LU 
A2Z0 
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A2 

"2(1 - y) 



(28) 



(29) 



Now that we know the complete generating func- 
tion A{x,y,t), we immediately know e.g. the generating 
function of the distribution of the number of B cells, 
A{l,y,t), and that of the total number of A and B cells, 
A{x,x,t). We also know the probability of having no B 
cells, ^(1,0,^), or having no cells at all, ^1(0,0,^). We 
can extract Pm,n{t) from this solution by using Cauchy's 
integral formula 



1 



(2TTi) 



dx 



dy 

1;"+ 



^A{x,y,t) (30) 



Numerically, this inverse transformation can be efh- 
ciently performed via the fast Fourier transform algo- 
rithm |23|. 

The solution of the alternative process (|4|, where mu- 
tations happen at cell divisions, is given in Appendix [A] 



III. CRITICAL CASES 

Here we consider the special cases of critical dynamics 
of A cells (Ai = 0), that of B cells (A2 = 0), or both. 
If only A cells are critical, all formulas of the previous 
section can be simply taken in the Ai — > limit. If, 
however, B cells are critical (A2 — 0), expressions like 
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(231 are ill defined. This A2 = case for general Ai is a 



bit cumbersome, but also implausible, as it describes a 
super- of subcritical population producing a completely 
neutral mutant. Hence we only consider here the bi- 
critical case Ai = A2 = 0, that is when the birth and 
death rates of both cell types are balanced 1 — u = Pi, 

Oil = 

The behavior of critical B cells can be obtained from 
(fT2|) by taking the A2 — > limit 



y + ^2^(1 - V) 

l + a2t(l-y) 



(31) 



We substitute this back to ( 10a I, and for X = 1 — yi we 
arrive at 



dX{t) 
dt 



1 + a2t(l - y) 



or equivalently 



dX{T) _ 2a2T 2 2iy 
— 1 — X H 

in terms of the new variable 



t + 



0:2(1 - y) 



(32) 



(33) 



Again, by writing X{t) = log Z{t), we simplify 



(l32j) to Z" - Z'/t - 4Z = 0, which is solved by the 

(34) 



modified Bessel functions 

Z = T[h(2T)+CKi{2T)] 



up to an arbitrary constant. By using identities (CI) 



for the modified Bessel functions, solution (34) can be 



transformed back to the original generating function 

1^ /o(2t) - CXo(2r) 



A{x,y,t) = 1 



a2T /i(2t) + CXi(2t) 



(35) 



The parameter C is fixed by the initial condition (11a 
/o(2ro) - a2iy-^To{l - x)h{2To) 



C = 



Ko{2to) + a2iy-^To{l - x)Ki{2to) 



with 



To 



1 

a2 



i-y 



(36) 



(37) 



This bi-critical case solution is somewhat simpler than 



the general solution (25 1, as it only contains low index 



modified Bessel functions. 



IV. ASYMPTOTIC BEHAVIORS 



determine the right (or interesting) scaling limit. In the 
case of studying e.g. only the distribution Pm{t) of one 
type of cells (either A or B), we first guess the leading 
order large time asymptotic of the typical number of sur- 
viving cells x(^)i E^nd wc study the t ^ 00, m ^ 00 scal- 
ing limit, with m — m/x{t) kept constant. We define the 
scaling limit for the distribution P,n{t) of the number of 
cells as x{t)Pm{t) — > p(™), and note that the generating 
function becomes a Laplace transform, if we also take the 
X — > 1 hmit, while keeping ^ = x{t) log 5: ^ x(*)(l ~ 2;) 
constant 



m>0 



m>0 



(38) 



e ™'^p(rh)drh 



The scaled density p{rh) has both a singular part de- 
scribing the extinction of cells, and a regular part de- 
scribing the surviving cells. The distribution of cells con- 
ditioned on survival is by definition P^{t) = Pm{t)/S{t), 
and we define its scaling limit as x{t)Pm{^) ^ p*{rh). 



Now by treating the first term of the sum in ( 38 ) sepa- 
rately, up to first order we obtain 



A{x, t)^l- s{t) + s{t)A* (0 



with the Laplace transform 



[ni)drh 



(39) 



(40) 



where the survival probability is asymptotically S{t) ~ 
s{t). Consequently, p{rh) = [1 — s{t)\5{rh) + s{t)p*{rh). 
Our scaling describes the fate of all surviving cells if 
A* (^) is the Laplace transform of a valid probability den- 
sity, i.e. if A*{Q) — 1. But to obtain the right scaling 
of the surviving cell distribution, we first need to under- 
stand the asymptotic behavior of the survival probability. 



A. Survival probability 



From the exact expressions (35) or (25) for A{x,y,t), 
we know the survival probability of A cells Sa (^) = 1 ~ 
^(0,1, t), that of B cells Ssit) = l"^(l,0,i), and that 
of any cell S{t) = 1 — A{Q, 0, t) at time t. Let us now use 
these exact expressions to understand their large time 
asymptotic behavior. 

We first consider the bi-critical case Ai = A2 = 0. It 
is immediate that that the survival probability of A cells 
SA{t) ~ 1/i as t — >■ 00. What is the survival probabil- 
ity of any cells S[t) = 1 - yi(0,0,i)? As < — > 00, also 
T ~ vtj a2 — ^ 00, while C from (|36|) remains constant. 



Now using the large argument limits (C2) of the Bessel 



functions in (35), we find that terms containing C are 



asymptotically negligible and 



Now that we have the exact solution (25), let us derive 
some asymptotic scaling behaviors. The first task is to 



(41) 
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In this bi-critical case, of course, all survival probabilities 
Sa, Sb and S tend to zero, that is the eventual extinction 
is certain. But since S is asymptotically much larger than 
Sa, it means that surviving cells are typically B cells. In 
other words Ssit) ^ S{t), which is true in general for 
any parameter values due to the finite mutation rate v. 
This strange behavior, that neutral mutants are more 
likely to survive than non mutants, can be generalized 
for multiple type processes, with successive mutations 
Ai A2 A3 . . . . It turns out that that the survival 
probability of A, is asymptotically Si oc [53]. 

In the supercritical case the survival probability is 
positive, hence the leading order asymptotic is constant 
S{t) — > = 1 — -^(0, 0, 00). Let us first assume that B 
cells are supercritical A2 > 0. In this case z{t = 00) = 0, 
and since from ([23|) and ( 24 1 




4;/ 



> 



we find that equation ( 25 ) reduces to 



A{0, 0, 00) = 1 + u; + A2(l - c) = 1 



(42) 



Using (23)-(24| we can re- write the probability to end 



up in the state without cells as 




UJ + \l 



(43) 



For A2 ^ we recover the classical single type result 
Soo = 1 — l3i ~ v for f5i + v < I, and Soo — otherwise. 
For A2 < there is no effect of B cells on the survival 
probability, hence the result is the same as for A2 0. 
Note again that in the leading order Ssit) ^ S{t) for 
t — > 00 due to the finite mutation rate v. Note also 
that the asymptotic survival probability Sqo can be ob- 
tained without the explicit knowledge of A{x,y,t), by 
solving the equations for A{0, 0, t — ^ c») derived from 
( 10 1. The above calculation therefore provides a check of 



self-consistency. 



Critical case 



Many interesting limits are worth to investigate in the 
simplest bi-critical case, Ai = A2 = 0, that is when the 
birth and death rates of both cell types are balanced 
a2 = P2 and Pi = I — v. We are going to describe the 
large time scaling behavior of Pm,n{i) in three naturally 
arising regions of the (to, n) plane: (i) corresponds to 
extinction m — n — Q] (ii) corresponds to the survival of 
only B cells to = 0, n > 0; and finally (iii) corresponds to 
the survival of both types to, n > 0. For all cases of this 
bi-critical process, the average number of cells behave as 



(to) 



1, 



(44) 



but these averages include extinct lineages as well. To 
find the correct scaling for each regions we need to know 
the corresponding typical behaviors. 



We understand the behavior in region (i) since in (41 ) 
we already obtained that all cells go extinct asymptot- 
ically with probability 1 — \J~vJ0i2t. Hence that main 
weight of Pra,n IS asymptotically concentrated at the ori- 
gin. 

Before exploring the (to, n) plane further, as a warmup, 
let us first consider the behavior of A cells alone, which 
sheds some light on the behavior of region (iii) . Type A 
cells are not affected by B cells, and behave as a simple 
critical process, with generating function 



A{x, l,t) = 



t{l -x) + l 



(45) 



The average (and typical) number of surviving cells 
is asymptotically x(i) = t, hence following the above 
method, we need to take the t, m — cx) and x — > limits 
with m = m/t and ^ — t(l — x) constant. In this limit 
we recover scaling (39 1 with A*{£,) = 1/(1 -I- ^), with 
Sa ^ 1/t. By an inverse Laplace transform we obtain an 
exponential density p* (to) — e~™ for the scaled number 
of the surviving cells. 

Now lets look at B cells as well. As we have shown 
in Sec. |IV A[ for large times if there are surviving cells, 
they are typically B cells only, hence the next largest 
weight corresponds to region (ii), that is to m = 0, n > 0. 
Therefore, we are interested in the distribution Pn{t) of 
B cells, irrespective of A cells. The generating function of 
Pn{t) is simply A{\, y, t). Next we need to guess the typ- 
ical number of surviving B cells in the asymptotic limit, 
which we set to x{t) — ct2t- The reason is that before 
the extinction of A cells a few B cells are produced, and 
they behave as a single type critical branching process 
of rate a2, which we just discussed above. This guess 
will be justified by the scaling function we obtain, i.e. by 
A*{0) — 1. Alternatively, we could have just assumed a 
general algebraic anzatz for x(t), and establish its linear 
time dependence from the scaling. 

Hence we take the scaling limit t — > 00, ?/ — >■ 1, with 
77 — 02^(1 — y) kept constant, and from (33) and (371 we 
find that both 



To 



0.211' 



(46) 



diverge as i — >■ 00. Note that although tq depends only 
on y in ( |37[ ) , we formally turned that into a t dependence 
from rj — a2t{\ — y). Therefore we use the large argument 
asymptotic of the modifies Bessel functions (C2|, and 
note that a; = 1, to obtain from (361 the asymptotic 
behavior of the parameter C ~ e^'^^' /tt. We see from (46) 



that r > Tq in the scaling limit, hence in (35) the terms 



proportional to C become negligible, and the generating 
function simplifies to A{l,y,t) = 1 — vla2T. From this 
we recover the scaling form ( 39 ) with 



A*{^) = \- 



V 



(47) 
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FIG. 1: Scaling of the probability P„(t) of having n copies of 
B cells at time t in the bi-critical case, that is when death and 
birth rates are balanced for both cell types, ai — 02 — 132 — 1, 
while Pi = 1 — V. Here we present plots for mutation rate 
1/ = 0.1 at several different times, calculated numerically 
from the exact generating function ( |35[ ). In the simultane- 
ous limit f,n — >■ 00 and n = n/x{t) being constant with 
xit) = a2t, the scaled distributions converge to a scaling limit 
x{t)P„{t)/SB{t) p*{n) given by ([48|, where Sb ~ \/v/a2t. 



and with Sb ^ \/v ja-it. Note that yi*(0) = 1 as re- 
quired. Now, by taking the inverse Laplace transform 
of the above expression (e.g. by integrating around the 
branch cut between rj — ~1 and 0), we find that the 
surviving B cells are distributed as 



p*(n)=e-»/2 [/„(,V2)-/i(n/2)] 



(48) 



with the scaling variable n — n/a2t. Interestingly, the 
scaling function is independent of the mutation rate v. 
Note also that the above distribution has an algebraic 
tail p*{h) cx n~^/'^ for h — > 00, which implies an infinite 
average number of cells. This is understandable, since the 
average number of surviving B cells grows as cx t^^"^ from 
( 41 1 and ( 44 1 , which is faster than the typical value cx t 



we used for the scaling. The convergence to the scaling 
function p* (n) for large times is illustrated in figure [T] 

What is the distribution of cells when both type of 
cells are present, that is in the "bulk" region (iii)? Note 
that the survival probability that cells of both types are 
still present at time t scales asymptotically as \/t. Since 
A cells behave critically, the number of surviving A cells 
grows linearly with time, and consequently the typical 
number of B cells grows as vt^ . This suggests to keep m/t 
and n/vt^ constant as t — 00. Hence in the generating 



function (35) we should take the t — ^ 00, x, y — ^ 1 limit, 
with ^ — t{l — x) and 77 — i't^{l — y) constant. In this 



limit, from (33 1 and (37), we find again that both 



To 



T = To 



2 



(49) 



diverge as t — >• co. Consequently, all terms of parameter 



o4to 



c 



(50) 



Therefore all terms containing Bessel functions in ( 35 ) 
have the same order, and we finally find that 



(51) 



The 



scaling function 

i2\ 



of surviving A and B cells 



p* {m/t,n/vt ) is the inverse Laplace transform of (51) 
in both variables ^ and t], which we could not obtain 
explicitely. 



C. Supercritical case 

There are many possible limits to take in a non-critial 
system, where Ai,A2 ^ 0. For example, one could con- 
sider all six possible orderings of 0, Ai,A2. Here we re- 
strict our attention to the most interesting case, when an 
already advantegeous cell lineage produces an even fitter 
mutant, i.e. A2 > Ai > 0. Again, we are interested in 
the distribution P„(i) of B cells, irrespective of A cells, 
which is given by Ail, y, t). 

First, let us provide an intuitive reasoning for the scal- 
ing, which will be justified later. Since A2 > Ai, the first 
surviving mutant produces the dominant portion of B 
cells. A single B cell eventually survives with probabil- 
ity \2I0L21 and the average number of B cells (including 
extinct hneages too) grows as e^^^*. Consequently, the 
average number of surviving offspring of a single initial 
B cell grows as x(i) = f^e"^^* for t — ^ cx), which we set 
as the typical number of surviving cells for the two type 
process. 

Hence, we study again 71(1, ?/,t) in the scaling limit 
^ ^ 00, y ^ 1, with T] = x{t) log l/y - v) being 

constant. Let us first determine the asymptotic value of 
the parameter C given in (p8|. In the scaling limit the 



1 



00 



parameter zq of (29) diverges: 
for A2 > 0. Thus we need to determine the asymptotics 
of functions Fj[zo) which appear in (28 1. Since kzq = 
—a (this follows from (29) and (23 1; also recall that we 
consider a; = 1), the parameter C is actually given by 



C 



c-i aFi{z(i) + 20^3(^0) 
' ~hF2{zn) + z^Fiizo) 



(52) 



Note that Ai > implies a <b from (23). However, here 
we obtain the limit of C for the case a > b, since this 
derivation is simpler, and the result is the same due to 



C of (36) have the same leading order behavior, and we the analyticity of C{a,b). Using the identities (C5) and 
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( C6 ) for the hypergeometric functions one finds 



D. Small mutation, large time limit 



Fi(zo)- (-zo) 
F2{zo) ^ i-zo) 



Tic)T{a~b) 

r(i + a)r(a) 

, r(2-c)r(a-6) 

r(-&)r(i-6) 



(53) 



Fi{zo) 



~Fi{zo) 
--F2{zo) 



as |zo| oo. Substituting (53) into (52) we obtain a 
neat expression for the parameter 



b r(c) 



a r(2 - c) 



T{a) 



(54) 



Note that this rcsuh is vaUd for arbitrary values of a and 
b. 



It is clear from (25) that the generating function 



A{l,y,t) of Pnit) depends on t and y only through C 
and z. We have just obtained C in the scaling limit (54), 
and from ( 12 ) we find that z ~ — I/??- Hence analogously 



to the limiting form of (39), from the definition (25) we 



obtain the Laplace transform of the surviving B cell dis- 
tribution in the scaling limit 



A*{rj) = 1 + 



u; + A2*(-l/?7) 



(55) 



where S{t) — >• Sao = w + Ai of (43) is the leading order 



constant survival probability of B cells, ^(z) is given by 
( 26 ) with C of ( 54 ) . Therefore technically all we should 



do is to inverse the Laplace transform ( 55 ) to obtain the 



scaling function p* (n) for the surviving B cells, with 



n 



As 



(56) 



Although this is easy to do numerically, analytically it 
is quite challenging, since the function ^'(z) involves the 
ratio of hypergeometric functions. This provides an ex- 
ample for a simple process where even the large time limit 
behavior is very complicated. 

One general comment concerns the behavior of the 
scaling function in the limit when the scaled variable 
n is large. We anticipate a generic exponential tail. 



p*{n) 



Ue- 



oo, which implies that the 



Laplace transform has a simple pole at h 
residue U, that is, 



U 



-u with 



(57) 



To determine m, U we must therefore find a real and pos- 
itive zero of function z^~^Fi(z) -|- CF2{z). For special 
cases of integer b, we can explicitly compute the scal- 
ing function p*{h), and verify the exponential tail. We 
relegate these details to Appendix \B\ 



In many biological applications the mutation rate 
refers to a single base pair change, which is typically very 
small (in human DNA the probability of such mutation 
is around 5 x 10~^° per base-pair per cell division j34j ) . 
Hence in this subsection we study the joint large time 
small mutation rate limit. 

We consider again the case where both types are ad- 
vantageous, and also A2 > Ai > 0. We take the — > 
and i — ?> cxD limits. As v ^ 0, we have Ai — ^ 1 — /3i, and 



using (23)-(24) we obtain 



a ^ aiv, ai 



1 

a2(l-/3i)^ 



A2 



(58) 



where < 60 < 1- Also, we anticipate the corresponding 
scaled number of B cells n — 0, then 77 — )■ 00, hence also 
z ^ —l/r] 0. Therefore we take a simultaneous v — )■ 
0,7/ — > 00 limit, while keeping 77(01;^)^/'''' constant. The 
reason for keeping this particular combination constant 
will become clear later. 



The smallest order terms of (27) are particularly sim- 
ple due to ( C3 ) 



Fi(z)^F2(z)^l 
aibo 



Fz{z) 



1 + 60 



V, Fi{z) 



while from ( 54 ) 



C ^ aivBoe 



TTbg 



Bo 



aibp 



sin nbo 



Substituting these limits into ( 26 ) we obtain 

-bo 



l + B^'lrjia^.y/bo] '° 



(59) 



It is clear that we obtain the only nontrivial limit when 
the new scaling variable ( ~ {aivy^''°r] = x(t, z/)(l — y) 
is kept constant, with 



^e^^*(ai7.)i/''° 
A2 



(60) 



Since for — !■ the survival probability ( 43 ) is s. 



1 — /?!, and uj ^ X2aiv, the scaling functional* from (55) 
becomes 



yi*(C) 



1 



1 + SoC''" 



(61) 



This scaling function depends only on the single param- 
eter < feo < 1- The corresponding scaled number of 
B cells is 77 = n/x{ii i^), with the scaling function p*{n). 
This generating function has been studied numerically in 
[TH], and it has already been derived in a very appealing 
approximate model in [20] . 



8 



0.01 



0.001 



0.0001 




0.01 



0.001 



0.0001 




FIG. 2: Scaling of the probability Pn{t) of having n copies 
of B cells at time t in the supercritical case for the simul- 
taneous limit of large time, large number of cells and small 
mutation rate, n — >■ oo, i/ ^ with n = n/x{t, v) constant 
and xi^j^) given by (601. We chose oi\ 



1, Pi 



0.81, 



a2 = 1.29, P2 = 0.91, which correspond to the scaling pa- 
rameter bo = 1/2. Plots are for mutation rate u = 0.0001 
at different times, cal culated numerically from the exact gen- 
erating function (25 1. The scaled distributions converge to 
the limit function !/)P„(i)/S_B(f) — )■ p'{n) given by (62l, 
where S'b — >■ 1 — /3i . 



Let us explore the properties of p*{n). For the spe- 
cial case 60 = 1/2 we can evaluate the inverse Laplace 
transform oi A* {Q) from (61) explicitely 



p*{n) 



1 



- e'V-^Erfc^ 



(62) 



For general b^, from the large 77 limit of A*{C,), we can 
obtain the small h series of the scaling function 



p*{n) 



-1 



1^ (-i?o)^T(6oj) 



(63) 



where > is given by ( 59 ) 



Indeed, by taking the 
Laplace transform of the forma series expansion of p* {fi) 
around n — Q and equating it term by term to the series 
expansion oiA*{C,) around C — 00, we obtain (63). Con- 
versely, from the small 77 asymptotic of A*{() we obtain 
the large n series 



p*ih) 



(64) 



This is achieved similarly to the small n series, but some 
care is needed. To avoid certain singularities, the se- 
ries expansion of the derivative -^A*{() should be com- 
pared to the Laplace transform of the expanded np* (h) . 



FIG. 3: Scaling function p*{n) of B cells for the supercritical 
case in the large time small mutation limit evaluated from 
(631. We plotted this function at four v alue s of the single 
parameter bo- Note the explicit formula (62 I for 60 = 1/2. 



The small and large h asymptotic power law behavior can be 
observed. 



truncating it after a finite number of terms provides an 
excellent approximation |32j . 

There are two surprising features of the scaling func- 
tion p*{h). First, from the small n series (63) we find 
that the density is singular at n = as p*{n) oc h''°~^, 
where — 1 < 60 ~ 1 < 0. Second, from the large h se- 



ries (64) we find that the density has a power law tail 
p*{n) oc with the exponent being 1 < 1+bo < 2. 

This means that this density function has an infinite av- 
erage. The scaling for large times is demonstrated in 
figure [2] while the scaling function p*{fi) is depicted in 
figure [3] for several values of its only parameter 60 ■ 

The power law tail of p* (n) has been observed numer- 
ically in [IH]. Note also that we have already encoun- 
tered a limit density with infinite average in Sec. |IVB[ 
for a critical process for arbitrary mutation rate. Here, 
however, a power law density appeared in a non-critical 
situation, where we expect exponential tails in general 
(see Appendix IbI). 



V. CONCLUSIONS 

In this paper we presented an explicit solution for a 
general two-type birth-death process with one-way mu- 
tations. This process was proposed by Kendall [H] to 
model the onset of a driver mutation in an evolving 
cell population. We computed the generating function 
(p5|)~(|27l) that encapsulates the probability distribution 
Pm,n{t)- The generating function can be turned into the 
exact probabilities Prn,n{t) by a fast Fourier transform 



Note that the small n series ( |63[ ) has an infinite radius 

of convergence. Conversely, the large n series ( 64 ) has it involves the ratio of hypergeometric functions 



Our explicit result ([25 



( |27[ ) is not easy to grasp, as 
It be- 



zero radius convergence, but it is asymptotic and hence comes somewhat simpler (35) in the bi-critical case, that 
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is when both cell types behave critically Ai = A2 = 0. 
The complete solutions, however, contain all information, 
for instance one immediately knows the survival probabil- 
ity of cells, or any order moments of the cell distribution. 
We have also extracted several interesting limits from 
the exact solutions. In the bi-critical case we showed in 
Sec. |IVB| how different types of scaling apply in different 
regions of the (to, n) plane. For the onset of an advante- 
geous mutation, A2 > Ai > 0, we derived the large time 
scaling limit, which limit function still involved hyper- 
geometric functions. This demonstrates that the large 
time limit behavior can be still very complicated. For 
certain special cases though, we derived the explicit scal- 
ing functions, with generic exponential tails. Conversely, 
in the simultaneous large time and small mutation limit, 
the scaling function has a power law tail, with infinite 
average value. As a consequence, an enormous number 
of samples are needed in simulations to recover the exact 
(finite time) average values. 

There are of course numerous interesting open ques- 
tions of two-type branching processes which deserve fur- 
ther attention. The long term description of tumor de- 
velopment though must involve more than two cell types 
corresponding to multiple stages of tumors O UHl 155] . 
These multi-type branching processes are likely too cum- 
bersome for explicit solutions, and one needs simplifying 
assumptions to deduce the relevant asymptotic behav- 
ior [201113]. Nevertheless, some effective models recently 
proved to be remarkably successful in connecting theory 
and clinical data '19'. 
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Appendix A: Mutation at cell divisions 

Here we briefly discuss a version of the two type process 
( 2|), where mutations happen at cell divisions Q. In 
(10a I the term i/A should be replaced by vA'B; all the 



rest in ( 10a)-( 10b ) remains the same. We follow the exact 



same steps as for the original model, and we arrive at the 
same solution (25l-(29l, but with parameters 



A, ' 



b = 



a; + 1 - /3i 



c= 1 



b- 



general forms 

F2{z) ^ F{1 + a - c,l + b ~ c;2 - c; z) 
(l + a-c)(l + 6-c) 



2-c 

X F{2 + a - c, 2 + fc - c, 3 - c; z) 



since in ( 27 ) we simplified these general formulas by using 
c = I + a + b, which is not true in the present case. 



Appendix B: Case of integer b 



In Sec. |IV C| we conjectured that the scaled distribution 
p* (n) has an exponential tail in general. In this Appendix 
we review a few concrete examples for the special case of 
integer b valu es. Recall that we assumed A2 > Ai > in 
Section IV C which implies 5 > a > from ( 23 1 . We do 



not impose any other restriction on a. 

Equation ( 54 1 shows that C diverges when 6 is a non- 



negative integer. This case corresponds to particular val- 
ues of the mutation rate 

_ l-f3i- 6A2 
l-(5a2)-i 



from the definitions (16) and (23), provided that < 
v < 1 — (3i. Our sole reason to consider these special 
mutation rates is to make the problem more tractable, 
and we anticipate that the behavior is qualitatively sim- 
ilar for arbitrary mutation rates. Indeed, when C — 00, 



equation ( 26 ) simplifies to 



* = 1 - c + z 



F2{z) 



and hence ( 55 ) becomes 



A*{^l/z) = l + 



1 



uj + A2(l — c) + A2 2; 



F2{z)_ 

Note that 77 = — l/z in this limit. The constant term on 
the right-hand side vanishes, see (42), and therefore we 
finally arrive at 

X2zFi{z) z d\ogF2 



A*{-l/z) 



.F2(z) 



dz 



(Bl) 



Here we also used that S00/A2 — b from (23) and ( |43[ ). 
Note that this relation provides another interpretation 
for integer b. Let us consider now concrete examples. 
Since b > 0, the simplest possible choice is 6 = 1. 



a2 



1 



and 




instead of (23) and (24). Also, while ^"1(2:) and ^3(2) are 



unchanged from (27 1, we need -^2(2) and F^lz) in their 



Since c : 

F2 



a + 6+ l = a + 2we conclude that 

F(—l, —a; —a; z) — 1 ~ z, ~ —1 



Therefore Eq. (Bl) becomes A*{—l/z) = z/{z — 1). In 
other words. A* {rj) = 1/(1 + rj), from which p*{n) = 
exp(— n). Thus in this case the scaled distribution of B 
cells is a pure exponential. 
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b = 2 



Appendix C: Useful formulas 



We have c = a + 6+ l = a + 3 and 



F2 = F(-2,-a;-l - a;z) = 1 - 



2a 
a + l 



a - 1 
a + l 



dz 



2a 
a + l 



„ a - 1 
2 z 

a + l 



Equation (Bl) becomes 



a — 1 + arj 



a - 1 + 2a?7 + (a + l)ry2 
and its inverse Laplace transform 



p*{h) 



a-1 
a + l 



cxp — n 



a-1 
a + l 



is a combination of two exponents. It is possible to obtain 
explicit expressions for larger b values in particular, but 
let us consider the problem now in general. 



c. Arbitrary integer b 

In this case F2 = F{—b, —a; 1 — b — a; z) is a polynomial 
of the order b; hence F4 is a polynomial of order b—1. It 
is easy to see that A* [rf) is the ratio of a polynomial of 
•q of the order 6 — 1 to the polynomial of of the order 
b. Therefore one can write an exact expansion 

Thus when b is integer, the scaling function is a combi- 
nation of b exponentials: 

6 



Here we list a few identities and limits of special func- 
tions from [351 [37] . The modified Bessel functions satisfy 



21[{z)=h{z)^h{z) 
-2K[iz) = Koiz) + K^iz) 
2h{z)lz = Io{z)-h{z) 
-2Ki{z)lz = Ko{z)-K2{z) 

and their z — >■ cx) asymptotic behavior is 



(CI) 



(C2) 



/25Z ' - ' V 22 
The hypergeometric function has the series expansion 



F{a, 6; c; z) = 1 



ab z a{a+ 1)6(6+ 1) 
Vl! ^ c(c+ 1) 21 



(C3) 



and the derivative 



-^F(a, 6; c; z) = — F(l + a, 1 + 6; 1 + c; z) (C4) 
az c 



We use the transformation formulas 



F{a, 6; c; z) = (1 - z^Fia, c - 6; c; 1 - 1/z) 
= (1 - zy^F{b, c - a; c; 1 - 1/z) 



(C5) 



and the identity 



F(a,6;c;l) 



for Rec > Re(a + b). 



r(c)r(c-a-6) 
T{c- a)T{c-b) 



(C6) 
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